Good morning.  I am Ray Reider and it is my pleasure to introduce to you
Prof. John McCarthy, who is the first recipient of the IJCAI Award for Research
Excellence. This award, which will be given at every subsequent IJCAI, honors an
AI researcher who has made several major and sustained research contributions to
the field.  It is difficult to imagine a more suitable first winner of this award
than John McCarthy, since he himself is associated with so many firsts in Computer
Science in general and in AI in particular. He made the first proposals in the
late 50's for time-sharing computer systems. In 1958 he invented LISP and later
directed its first implementation at MIT. He was a pioneering advocate of reasoning
about computer programs and proving properties about them, and he defined a formalism
for presenting recursive programs which became the theoretical basis for LISP. In
1960 John McCarthy made the first proposal for the use of logic in knowledge
representation. Out of this early research in logical foundations for AI, and in
later work with Pat Hayes, came a broad range of influential research topics: the
frame problem, the situation calculus, planning and dynamic worlds, reasoning
about causality, and many others. His most recent contribution to AI is in the
logical tradition which he founded. This is his circumscription formalism for
non-monotonic reasoning, designed to formalize a wide variety of common sense
knowledge which conventional logical approaches to AI were unable to formalize.
In addition to his distinguished research career, John McCarthy has been a
major influence on the evolution of AI as a discipline. He is universally
acknowledged to be one of the founders of the field and even to have coined
the name Artificial Intelligence. While on the faculty at MIT from 1958 to 1962,
he, together with Marvin Minsky, organized and directed the MIT AI Project.
After moving to Stanford, he organized the Stanford AI Lab and served as its
director from 1965 to 1980. It is with great pleasure, therefore, that I present
to you, John McCarthy, the certificate honoring you with the first IJCAI Award
for Research Excellence. Professor McCarthy will then tell us about the easiest
unsolved problems of AI.

Last year I spoke to a similar audience, and in a certain sense, which I will
explain, I wish I were in Siberia; namely, once I went to Siberia, to Novosibirsk,
and gave a lecture there, and when I came back I mentioned this to a friend of
mine, Lloyd Shapley, and he said, 'Oh, if you haven't got a new lecture, get a new
audience.' So, Ray has done his best to convince you that I deserve this award,
and it would be smartest of me to go away now and not spoil his efforts; but I
hope I won't.

I had considerable difficulty preparing for this, because of the title that I
chose, 'The Easiest Unsolved AI Problems.' I will mention some problems, but they
aren't easy and furthermore, which is somewhat worse, they are not very sharply
defined. However, first I want to discuss the tradition of approach to Artificial
Intelligence that I am following and had better stick to at this late date. One
can consider two major approaches, and the first one would be to understand and
simulate human intelligence; even that one splits up: you can try to understand
it at the physiological level, in terms of nerves and so forth, but unfortunately
physiology is not yet sufficiently far advanced; or you can try to understand
it at the psychological level. But the tradition of AI that I am interested in
takes a quite different view, and its objective is to understand scientifically
the relations among situations and goals and the actions required to achieve
the goals; and the key problem in this is the question of what information about
the world is available that is useful in achieving goals, and how we make
computers manipulate this information. These two approaches to Artificial
Intelligence are not so disjoint as it might seem, because, of course, we who
are interested in any methods that will work find ourselves looking into our
own heads and saying 'well, what do we do?' and listening to the psychologists
when they try to tell us what people do.

The approach that I am going to mention really starts in 1958, I date it from
then, although the paper Ray mentioned was actually printed in 1960; but I tend
to think of it as dating from when I gave the talk. And so I would like the first slide.

(They tell me that if I leave any slide on too long, they will catch fire)

We want to use mathematical logic, then, as our key source of languages; I don't think
we can call mathematical logic a single language. We use it to represent the facts of what
we know about the effects of actions, what we know about the goals we want to
achieve, the kind of general principle that one should do what achieves one's goals,
and also facts about the particular situation. And then the proposal of that paper
was that we should logically deduce a sentence of the form 'I should do action A' and
then do it; and I still adhere in the main to the program of that paper, but
it turns out that logical deduction has to be supplemented by non-monotonic
reasoning, and I will say more about that later. One of the major areas of
research in AI, especially the formal part, has been formalizing the
effects of actions; and even in the non-formal part, under the heading of
planning, goes the problem of, given some facts about the effects of actions and about the
preconditions for performing various actions, finding a sequence of actions that
will achieve a given goal. This is perhaps a subcase, in which we are still mired,
of the more general problem of finding a conditional strategy that will achieve the
goal. One of the major formalisms for this has been the so-called situation
calculus, whose main formulas have the form 'the new situation is the result of an
event occurring in an old situation.' This has to be looked at somewhat critically,
because it is a very partial formulation of the results of actions. Only some kinds
of actions and events can be formalized this way, namely, when we regard the actions
as being discrete in character and having a kind of definite effect; but if there are
concurrent actions going on, then the situation calculus, at least in the forms
in which it has been studied in the literature, does not satisfactorily formalize
it.
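
Concretely, the main formula shape of the situation calculus can be written as
follows; the block-moving instance is an illustrative reconstruction, not a
formula from the talk.

    % The new situation is the result of an event occurring in an old one:
    s' = result(e, s)
    % Illustrative instance: after moving block b onto block c,
    % b is on c in the resulting situation.
    \forall b\,c\,s.\ on(b, c, result(move(b, c), s))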

Even within this framework two problems arose. One of them is the so-called
frame problem, which is the problem of not wanting to put in a very large number
of axioms about what things an action does not affect. Here we rely a bit on our
human experience, and it tells us that it is quite counter-intuitive to have to
include an axiom that turning over one of my index cards does not affect the fact that
I am still in Los Angeles. And it is of course clear that you would have an
enormous combinatorial problem if you did that. In my present view, the frame
problem is substantially solved, and I am going to mention later one of the non-
monotonic formalisms that seems to me to solve it for the time being. But I
wouldn't be surprised if it were to recur, and this is also true, to a somewhat
lesser degree, of the next problem listed on the slide: the qualification problem,
which concerns the qualifications one has to put on actions.
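
For a sense of what one does not want to write, a naive frame axiom looks
roughly like this (the predicate and action names are illustrative):

    % Moving a block does not change any block's color:
    \forall b\,x\,l\,s.\ color(b, result(move(x, l), s)) = color(b, s)
    % ...and one such axiom is needed for every action/fluent pair,
    % which is the combinatorial burden the frame problem names.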

(I am going to turn this over and hope that it will go dark; I am not going
to use this slide for a while, so don't let it burn up)

Since I have worked on the connections of logic and AI, I want to mention what some of
these are. There are certainly quite a lot of them, and of different characters. In
1958 there was this proposal for programs with common sense, and as I mentioned,
this involves trying to formalize the facts about the world in logic and to have
a system that would deduce what it should do. Another thread of AI and logic is,
of course, automatic theorem proving, which takes the problem of proving mathematical
theorems as an AI domain; and it is certainly a domain that has proved quite
difficult, one in which to some extent people have backed off from some of the
more original goals and have considered human-aided proofs, where the machine does
part of the work and the people supply it with lemmas and things like that. In my
view the most outstanding work in that area has been done by Woody Bledsoe, by
Bob Boyer and Jay Moore, and by Larry Wos. Maybe I am not even really competent
to say who has done the best work in that area. The situation calculus has had
a number of descendants and streamlinings, one of which is STRIPS; and the
next thing that I should mention is logic programming. Let me remark that
there are other logical resources that are available to AI but which have been
less used, and one of them that people are trying to use is modal logic, which
is useful for talking about beliefs and knowledge, to be able to say things
like 'travel agents know airline schedules'; there are a whole batch of unsolved
problems there that people can work on. I guess the sharpest of them concerns
being able to put into databases information about what information is and is not in
other databases, and what various people and programs know how to do. Another area
of logic which AI hasn't really gotten into is axiomatic set theory. Now axiomatic
set theory is very powerful; it is more powerful than higher-order logic, even
though it is formalized in first order logic. You get the greater power, of course,
because you put in a whole batch of axioms, the Zermelo-Fraenkel axioms. I should
mention, for those that are not familiar with set theory, that this is the formalism
that allows sets and sets of sets and sets of sets of sets, up to as high an
order as you like, including infinite order. Just to mention one thing: the
famous construction of the ordinal numbers, including the ordinary numbers,
where zero is the null set and any ordinal number is the set of all smaller
ordinal numbers. This is a very powerful system; however, there is one major
problem before we can use it, and that is that one of the major constructions
of set theory is to form the set of things that have some property. If you just
allow yourself to do that entirely freely, then you can form the set of sets that
are not members of themselves, and you immediately arrive at a paradox; but the
set theorists restrict that. The simplest form of the restriction is to say
'you can only take things out of a pre-existing set.' And that is quite powerful,
and if you want to make set-theoretic proofs, you use that principle a lot, but
nobody has as yet been able to figure out how to control a computer in using that
principle. There is just an enormous number of sets that can be formed, and how
the computer is to decide which one is useful is the key thing.
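
For those who want the formulas behind these remarks, here is a brief sketch
in standard set-theoretic notation; this is textbook material, not anything
specific to the talk.

    % Von Neumann ordinals: each number is the set of all smaller ones.
    0 = \emptyset, \quad 1 = \{0\}, \quad 2 = \{0, 1\}, \quad n + 1 = n \cup \{n\}
    % Unrestricted comprehension yields Russell's paradoxical set:
    R = \{x \mid x \notin x\} \quad\Rightarrow\quad (R \in R \leftrightarrow R \notin R)
    % The separation axiom restricts comprehension to a pre-existing set a,
    % which blocks the paradox:
    \forall a\, \exists b\, \forall x.\ (x \in b \leftrightarrow x \in a \wedge \varphi(x))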

Some other relations between logic and AI: there is non-monotonic reasoning,
which I'll talk about later; there is the whole area connected with the logic-level
description of systems. Newell, in his first AAAI presidential address,
talked about that. He said that even if you aren't directly using logic in your
program, you can use logic to describe the program, and also to describe
what it knows about people, in the sense of ascribing knowledge or belief. A
topic I want to discuss a bit later is logic plus control.

Over the course of time, a number of concepts and a number of ideas have been
introduced, and I want to mention some of these. The first one I want to mention,
one that I have been pushing for some years, is that in Artificial Intelligence
we can often separate the problem into heuristics and epistemology. The heuristic part
is concerned with search, with pattern matching, in general with avoiding
combinatorial explosions, whereas the epistemology, taking that to mean what
the philosophers mean by it, the theory of knowledge, is concerned with
what knowledge there is about the world. In my view it has turned out that
up to now the epistemology, which really ought to be done first in any particular
problem, has presented very great difficulties, so that applying heuristics
to certain classes of problems has been hampered by the fact that we haven't
really solved the question of what information there is and how it is to be
represented.

Associated with that is the proposition, put forth in the 1960 paper that Ray
referred to, that most of the information that humans use, that humans transmit
from one to another, and that computer programs should use, takes a
declarative form. It consists of facts rather than algorithms, and in fact even
when we are instructing another person in how to do something, what we tell
them is mostly facts, and then we rely on their reasoning ability to do a
substantial part of deciding what to do.

Another concept that I want to mention in that connection is the concept of
epistemological adequacy of a formalism; namely, the formalism must be
adequate to express the information that is actually available. To give an
example of that, if we are interested in the state of a quantity of gas,
then a physically adequate formalism is the positions and velocities of all
of the molecules of the gas. However, that information is not available,
nor could we do the computations that are required with it. If you were
interested in the question of what would happen if I were to spill this glass
of water, rather than drink it, you would not be interested
in using the equations of hydrodynamics, because you wouldn't have the initial
conditions, and if you did, you couldn't do the computations; but you have this
knowledge of common-sense physics which is sufficient to tell you that the
people in the balcony need not worry about being splashed. That turned out,
in my view, to be an important concept. We have to consider what information is
available to the reasoner, whether it be human or robot. Let me point out to you
that while you could imagine that there would be some hope, if you really had a
Cray and some really super television cameras, of applying the laws of physics to
the spilling of the water on the podium, still I believe that even today,
if you were to undertake the project of making the necessary cameras and doing
the necessary computations so you could find out where the water would land, it
would still be a couple of years' research, even after you took delivery on your Cray.
Common-sense physics turns out to be a very important topic here.

Another thread of things which runs through AI is what one
can call reification, and this is again a word taken from philosophy, where it is
often an epithet; people at times accuse each other of reifying, and what reifying
means is 'making a thing out of.' I should have a good
slide for this but I don't. It turns out that for AI purposes reification is often
very important. To give an example, you can either say behind(McCarthy,
Podium), or you can say true(behind(McCarthy, Podium)) in a certain situation;
in the first case we have the predicate 'behind' and its two arguments
'McCarthy' and 'the podium', and in the second case we have a function
'behind' which forms an object out of the symbols for McCarthy and the
podium. Well, you can ask 'why do you want to do
that?' The answer is that if you want to do certain
kinds of general reasoning, suppose we want to say that Tom believes that
McCarthy is behind the podium, and then we want to make the general remark that
whatever Tom believes, George believes, or George is willing to believe; then
we need a quantifier that ranges over propositions, and the way to do that seems
to be to reify.
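
Written out, the contrast is roughly the following; the notation is a
reconstruction rather than a transcription of the slide.

    % Unreified: 'behind' is a predicate, and nothing lets us quantify
    % over the assertion itself.
    Behind(McCarthy, Podium)
    % Reified: 'behind' is a function forming a proposition-like object,
    % evaluated by a truth predicate in a situation s.
    true(behind(McCarthy, Podium), s)
    % Reification then lets a quantifier range over propositions:
    \forall p.\ believes(Tom, p) \rightarrow believes(George, p)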

(We can have a slide now)

A key problem that I think should guide a lot of the work in formalization
is to be able to get a formalization which is adequate to construct a common-
sense database, that is, a database of common-sense knowledge that could
be used by a wide variety of programs. So we would like to have
in it a common-sense physics that would enable a program to determine in a particular
case what would be the consequences of my dropping the glass of water, etc.
And hopefully the database should be constructed in advance of decisions as
to what specific applications it should be used for, and it should end up not being
ad hoc and not having too much information in it. My opinion is that no one has
yet got an adequate formalism for doing a common-sense database. Let me
illustrate by one more example the requirement for generality, an example which
already appeared in that 1958 paper. In that paper the question arose
of the possibility of driving a car from one point to another, and you could
imagine that in the database there were a large number of facts that say
drivable(Pasadena, San Pedro), to take two places in this area. But that becomes
quite implausible, because the number of pairs of points is too large; you can
hear of a new place and ask 'is it possible to drive from this place to some
other place that you also just heard of?' and you'll give the answer 'yes', and
so the information has to be represented in some much more general form, like
'between any two towns in California it is possible to drive.'
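
As a minimal sketch of the point (my rendering in PROLOG notation, not the
1958 paper's), the general rule replaces the table of pairs:

    % Instead of one fact per pair of places...
    % drivable(pasadena, san_pedro).
    % ...one general rule plus a fact per town:
    town(pasadena, california).
    town(san_pedro, california).
    drivable(X, Y) :- town(X, california), town(Y, california), X \== Y.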

Now I want to mention a discovery that was made, and this discovery was made
by various people in some sort of interaction, and I can surely get myself
into big trouble if I try to disentangle who did what; but, anyway, the
people who were involved are Robert Kowalski and Alain Colmerauer and Pat
Hayes and mumble (the mumble being the person that I clearly should have
also mentioned), and it has to do with logic programming. Let me formulate the
discovery in the way which I think is the most significant. It had two aspects:
one is that Horn-clause logic can be used as a programming language, and the
other is that certain kinds of facts can be represented as Horn clauses and
certain uses of these facts can be made by PROLOG-style execution. There are
two parts to it: one is that certain facts can be represented, and the other
is that certain kinds of uses can be made of these facts; and one of the points
I want to emphasize is that these are two different things. Let me give
an example of that, because it presents the kind of unsolved problem that
I promised to mention. Consider the fact that a container is sterile if
all the bacteria in it are dead. This fact can be represented as a fragment of a
PROLOG program, and as such it can be used, as the PROLOG people are fairly proud
to announce, in two ways. One, it could be used to determine whether a container
was sterile; as a fragment of a PROLOG program it would suggest doing that
by looking at each bacterium in the container and seeing if it was dead. It
could also be used to sterilize the container, again by finding each bacterium
in the container and killing it. That fact, however, is not used in this way.
That is, it is not used by direct execution. It is instead used, for example,
to rationalize sterilization by heating the container, which kills all of the
bacteria; and so one might say the formula is being used in a quantified form.
The 'for all x, x is dead in the container' is being used to reason
about the effects of heating the container, and it is not merely being
specialized to particular bacteria. Similarly, you test whether a container
is sterile by putting some of the bacteria out onto a culture medium, and you
say 'if there are any bacteria still alive, then you will get some colonies.'
To try to formulate this as questions, we can ask: what are the limits of PROLOG?
What uses of facts require something other than PROLOG-style execution of
Horn clauses? That was a mere example. The question also arises, in very active
fields of research, can PROLOG be significantly extended to represent new classes
of facts while still retaining its executability? The answer is that it
certainly can, but of the various extensions that have been proposed it's not
clear that they have the general utility of the original formalism.
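
Here is a minimal sketch of the sterile-container fact as a PROLOG fragment;
the predicate names and sample facts are illustrative assumptions.

    % A container is sterile if no bacterium in it is alive.
    sterile(C) :- \+ (bacterium(B), in(B, C), alive(B)).

    % Illustrative facts:
    bacterium(b1).  bacterium(b2).
    in(b1, jar).    in(b2, jar).
    alive(b1).

    % ?- sterile(jar).  fails, since b1 is alive; with no live bacteria
    % in the jar it would succeed.

Direct execution can only enumerate the bacteria one at a time; the quantified
use, concluding that heating the container kills all of them at once, is
exactly what this style of execution does not provide.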

In this connection, we get another batch of problems which come under the
heading of the control of reasoning. I have thought about this a little bit
in connection with PROLOG, about how one can add explicit control to it. Bob
Kowalski coined the slogan that algorithm is logic plus control. So, presumably,
this suggests that we can split up a problem by expressing in the logic the
facts that determine what should be done, and then in a separate control part
saying something about how these facts are to be used.

Recently I have thought about this slightly differently than I had before; that is,
before, I was thinking about adding control to PROLOG programs, and of course lots
of other people have thought about that and devised mechanisms for it. But now
what I am thinking about is saying: let's take a more declarative approach
and consider facts that are relevant to control, and ask that some program, for
example, something that thinks before it executes, use these facts. One area
where I've found you can do something is map coloring. So you would like to
color a map with four colors, and one takes the following observation, which was,
I guess, due to Kempe in 1879: if you have a country that has three
or fewer neighbors, then no matter how the rest of the map is colored, there is
a color for that country, there is a consistent color for that country. What
I would like to do would be to simply supply the fact in that form and have it
suitably used. Here is how it could be suitably used, and we would like the
program to do this automatically. If you have a map to color, then you can remove
all the countries that have three or fewer neighbors and say 'let's color the rest
of the map, because there will be no trouble coloring these countries once we've
done that.' That's fine; that's the kind of reduction step. But the remaining
map, after all countries that have three or fewer neighbors have been removed,
may itself have some countries with three or fewer neighbors, because, of course,
removing some countries removes the neighbors of some of the remaining countries,
and so what you really want to do is repeat this as long as possible to get
a completely reduced map in which every country has four or more neighbors.
If you take the map of the United States, or the
map of Europe and Asia, or any other map that I've been able to actually find,
that is, any map on the surface of the earth, the completely reduced map is
the null map. So, if you apply it to the United States, you strip off California
and Maine and Florida and North Dakota, I think, in the first step, strip off
Arizona and so forth in the second step. After four steps you are left with
only Kansas. One can think about that in two ways. One is to say, well,
all right, let's add control; the PROLOG program to color maps is extremely
trivial, but it will take a long time if you just let it
run brute force, and you can talk about adding this kind of control to it. But
I'd like to do something more fancy: state the fact relevant to the control. It turns
out that one can say more with regard to maps, namely, if you have a country
that has four or fewer neighbors, then you can also leave it to last; but of course
it might turn out, when you want to color this country with four neighbors, that you
have already given different colors to all four of its neighbors. Then what you
have to say is that there happens to be an easy way in which you can modify,
without a lot of search, the colors of the countries
you have already colored, so as to be able to color that remaining country.
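
For concreteness, here is a minimal sketch of the brute-force colorer together
with the reduction used as control; the adjacency facts and predicate names are
illustrative assumptions, not McCarthy's program.

    % Illustrative adjacency facts for a tiny map.
    adjacent(a, b).  adjacent(b, c).  adjacent(c, a).  adjacent(c, d).
    neighbor(X, Y) :- adjacent(X, Y) ; adjacent(Y, X).

    % Brute force: give each country a color no colored neighbor has.
    color_map([], []).
    color_map([C|Cs], [C-Col|Done]) :-
        color_map(Cs, Done),
        member(Col, [red, green, blue, yellow]),
        \+ (neighbor(C, D), member(D-Col, Done)).

    % Kempe-style reduction as control: repeatedly set aside a country
    % with three or fewer neighbors among those still present.
    % reduce(+Countries, -Core, -Postponed): Postponed lists the
    % set-aside countries in order of removal.
    reduce(Countries, Core, [C|Ps]) :-
        select(C, Countries, Rest),
        degree(C, Countries, N),
        N =< 3, !,
        reduce(Rest, Core, Ps).
    reduce(Countries, Countries, []).

    degree(C, Countries, N) :-
        findall(D, (member(D, Countries), D \== C, neighbor(C, D)), Ds),
        length(Ds, N).

On this tiny map, ?- reduce([a, b, c, d], Core, Ps) gives Core = [], the null
map; one then colors the core and the postponed countries in reverse order of
removal, each of which is guaranteed a consistent color.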

Another area of AI which presents problems is the area of expert systems, which
are mainly based on pattern-action rules, and one is interested in the limitations
of AI systems based on pattern-action rules. The system which I have studied
in that connection is MYCIN, and it is not too clear which of its limitations
are due to the fact that it is programmed in pattern-action rules and which
are due to the specific tasks that its builders undertook, but it does have some
weaknesses that I want to mention extremely briefly. One is that it doesn't
offer any kind of prognosis and doesn't do any planning, and that is because it
is expressed entirely in terms of rules saying that if these are the symptoms
and the results of tests, then this is the diagnosis and this is the proposed
treatment. But it never goes through any reasoning of the kind: this particular
antibiotic can only be used in someone who does not have a fever, so let's give
this patient aspirin, check to see when we've got rid of his fever,
and then give him the antibiotic.

I have just gotten into panic mode from looking at the clock. That will account
for the change in style of presentation.

There are all kinds of extensions that one can make to the blocks world
problem, and one I would like to recommend people think about is this: suppose
we add to the blocks world the concept of a tower, and we consider that one would
like to design a tower. Let us say a traffic light is a tower of blocks
that has a red block on top of a green block on top of some other blocks, and
then we would like a system that would design first and make a construction
plan later. We would also like to be able to talk about incomplete towers and
broken towers and actions of finishing the tower.
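
A minimal sketch of the specification side in PROLOG terms; the representation
of a tower as a top-down list and the predicate names are illustrative
assumptions.

    % A tower is a list of blocks, topmost first, each resting on the next.
    % A traffic light: a red block on a green block on some other blocks.
    traffic_light([Top, Second | Rest]) :-
        color(Top, red),
        color(Second, green),
        Rest \== [].

    % Illustrative blocks:
    color(b1, red).  color(b2, green).  color(b3, blue).
    % ?- traffic_light([b1, b2, b3]).  succeeds.

The interesting part, which such a fragment does not touch, is designing a
tower meeting the specification first and only then planning its construction,
including reasoning about incomplete and broken towers.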

Now I am ready for a slide. Let's skip this one; on to the next slide.

I think I am going to go through this much too rapidly to understand,
and hope that many of you have been studying various forms of non-monotonic
reasoning. I mention in the slide the property of monotonicity that ordinary
logic has, that when you increase the premises, you increase the conclusions; and
it turns out that ordinary human reasoning is often non-monotonic. There
are the three formalisms mentioned in the slide.
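
In symbols, monotonicity is the following property, which the non-monotonic
formalisms deliberately give up (standard notation, not the slide itself):

    % If p follows from premises A, it follows from any larger set B.
    A \subseteq B \;\wedge\; A \vdash p \quad\Rightarrow\quad B \vdash p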

Go on to the next slide.

This is just to show you that there really are some formulas. This is the general
definition of the formula for circumscription; you will note that it has a 'for
all p'', where p' is a predicate variable, and so therefore it is a second-
order logic formula; and I am just going to go on ruthlessly.
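
Since the slide is not reproduced here, the following is one standard form of
the definition: the circumscription of a predicate P in a sentence A(P) says
that P is a minimal predicate satisfying A.

    A(P) \;\wedge\; \forall p'.\ [\, A(p') \wedge \forall x.(p'(x) \rightarrow P(x))
        \;\rightarrow\; \forall x.(P(x) \rightarrow p'(x)) \,]

The quantifier over the predicate variable p' is what makes the formula
second order.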

This is the alleged solution of the frame problem that I mentioned earlier, and
it uses this notion of abnormality; the predicate ab stands for abnormal. It is
easiest to read if you imagine dropping all the clauses that have 'not abnormal'
as part of them. The first formula then says
that the location of a block after an event is the same as it was before,
and likewise the color of a block is the same as before, unless there is
something abnormal. Then the further rules describe when things might be abnormal,
and the non-monotonicity comes in when, after you have the description of the
situation, you try to minimize abnormality; so this in fact is the logical analog of
something that you have in physics, the minimum principles, and it is
sort of interesting that there may even be minimum principles in logic.
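
The slide's formulas are likewise not reproduced in this transcript; the
following reconstruction in the style of the circumscription papers conveys
their shape (the aspect names are conventional placeholders):

    % Locations and colors persist unless something is abnormal:
    \forall b\,e\,s.\ \neg ab(aspect1(b, e, s)) \rightarrow
        location(b, result(e, s)) = location(b, s)
    \forall b\,e\,s.\ \neg ab(aspect2(b, e, s)) \rightarrow
        color(b, result(e, s)) = color(b, s)
    % Moving a block is abnormal with respect to its location:
    \forall b\,l\,s.\ ab(aspect1(b, move(b, l), s))
    % Minimizing ab (by circumscription) then makes persistence the default.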

Next slide, please.

More of that same thing. Next slide.

We have here this principle of rationality, which Newell has discussed quite
a lot, and this principle can be elaborated, but it also can be shrunk. One of
the key problems of the use of non-monotonic reasoning is to accomplish these
shrinkings, and each of these English sentences is to be taken with a grain of
salt, that is, with a ceteris paribus clause: it will achieve its goals unless
something is abnormal; it will do what will achieve its goals unless something is
abnormal; and even the original one also contained this 'unless something is
abnormal', but what may be abnormal is to be expressed by further axioms.

Next slide please.

(SIDE 2)

This represents a kind of issue in AI, and, you see, maybe I don't have
to add anything verbal to the slide.

Next slide. Actually I would just as soon leave the screen blank for a bit.

To return to the question of problems: there is the problem that this formulation
using abnormalities has a lot of limits, and one would really like to get rid of
the abnormalities. I shall mention a technical problem which arose in
connection with the abnormality formalisms. One of the chief examples for non-
monotonic reasoning is the question of birds flying and what the exceptions
to this are, and it is easy enough to put axioms into the abnormality formalism
that say that unless something is very abnormal, dead birds don't fly. And that
would be fine for the axiomatics of birds, but what you really would like is to
get that out of something far more general: that unless something is very
abnormal, dead objects don't do anything, or dead animals don't carry out their
normal activities; and how to get it out of this more general
thing without introducing, one might say, sneak possibilities is a somewhat
difficult problem.
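
In the abnormality notation, the bird-specific axioms are easy to write down;
this rendering, including the predicate names, is illustrative:

    % Birds fly by default; being dead is an exception.
    \forall x.\ bird(x) \wedge \neg ab(aspect1(x)) \rightarrow flies(x)
    \forall x.\ bird(x) \wedge dead(x) \rightarrow ab(aspect1(x))
    % The wanted, more general default, from which the above should follow
    % without opening sneak possibilities:
    \forall x.\ dead(x) \wedge \neg ab(aspect2(x)) \rightarrow inactive(x)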

Now I want to mention another problem, which is called ambiguity tolerance.
We would like systems that are ambiguity tolerant, and let me give an example
here. There happens to be a law against conspiring to assault a federal officer,
and people have for a number of years been indicted for this, and some were
convicted and so forth. And then
somebody offered the following defense. He said: you haven't proved that my client
knew he was a federal officer; it is true that my client conspired to assault this
guy, but maybe he didn't know he was a federal officer. That's
a kind of de dicto defense, to use the philosophers' jargon. Another defense is
the following: although my client thought this guy was a federal officer, he wasn't
really; and so, if the fellow that my client was conspiring to assault was not a
federal officer, even though my client thought he was, what about that? These
cases went against the criminals both ways in the actual courts. A third
possibility, that no lawyer has advanced, is: suppose that somebody puts an
advertisement in the Criminal Gazette saying that he will pay $5,000 for the
scalp of any federal officer. That should be criminal all right, but you can
offer the following defense: conspiring to assault a federal officer requires
that you exhibit the federal officer; surely there must be a particular federal
officer that my client was conspiring to assault. So, here is the point about this
as a problem for AI, and I think this is the problem I am going to finish with.

Suppose that you are doing a database for the district attorney, and this database
is to be used by a program that would decide whether to ask for an indictment
for some crime. Then must you have thought of all of these things before you
built the database? The answer had better be 'no', because the philosophers
have been spinning this kind of thing out for 2,000 years and seem perfectly
capable of continuing it for another 2,000 years. It shouldn't be necessary
to solve all of the problems of philosophy before one can do AI. What this means
is that AI systems must be ambiguity tolerant. This phrase I got from a
philosopher, namely, Hubert Dreyfus, who considers thereby that AI is
impossible, because computers can't be ambiguity tolerant. But it turns out that
through the wonders of non-monotonic reasoning you can in fact make them
ambiguity tolerant, at least it seems so, by saying something like this:
the law is unambiguous except where an ambiguity is actually found. You
minimize ambiguity, and therefore it will hopefully turn out that the system will
behave like a human in the cases where the issue isn't raised or doesn't
arise; if there was no doubt that the guy knew in every case that the other guy
was a federal officer, and he really was, and so forth and so on, then there won't
be any problem. But nevertheless, to really put in this ambiguity tolerance,
there are many problems to be solved.

Well, I see I planned badly and ran out of time; I will stop now. Thank you.